NSF PAR Search | NSF Public Access Repository

CO2-Meter: A Comprehensive Carbon Footprint Estimator for LLMs on Edge Devices

Fu, Zhenxiao; Chen, Fan; Jiang, Lei (December 2025, Association for the Advancement of Artificial Intelligence)

Free, publicly-accessible full text available December 1, 2026
LLMCO2: Advancing Accurate Carbon Footprint Prediction for LLM Inferences

https://doi.org/10.1145/3757892.3757901

Fu, Zhenxiao; Chen, Fan; Zhou, Shan; Li, Haitong; Jiang, Lei (July 2025, ACM SIGEnergy Energy Informatics Review)

Throughout its lifecycle, an LLM incurs significantly higher carbon emissions during inference than training. Inference requests vary in batch size, prompt length, and token generation, while cloud providers deploy heterogeneous GPU configurations to meet diverse service-level objectives. Unlike training, inference exhibits lower and highly variable hardware utilization, making equation-based carbon models unreliable. Existing network-based estimators lack accuracy, as they fail to account for the distinct prefill and decode phases, hardware-specific features, and realistic request distributions. We propose LLMCO₂, a graph neural network (GNN)-based model, to improve the accuracy of LLM inference carbon footprint estimation by ~67% over prior approaches. Source code is available at https://github.com/fuzhenxiao/LLMCO2.
Free, publicly-accessible full text available July 1, 2026
Quantum Neural Network Extraction Attack via Split Co-Teaching

https://doi.org/10.1109/ICASSPW65056.2025.11011056

Fu, Zhenxiao; Chen, Fan (April 2025, IEEE)

Free, publicly-accessible full text available April 6, 2026
LSTM-QGAN: Scalable NISQ Generative Adversarial Network

https://doi.org/10.1109/ICASSP49660.2025.10888847

Chu, Cheng; Hastak, Aishwarya; Chen, Fan (April 2025, IEEE)

Free, publicly-accessible full text available April 6, 2026
Unified algorithms for RL with Decision-Estimation Coefficients: PAC, reward-free, preference-based learning and beyond

https://doi.org/10.1214/24-AOS2483

Chen, Fan; Mei, Song; Bai, Yu (February 2025, The Annals of Statistics)

Free, publicly-accessible full text available February 1, 2026
Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability

Chen, Fan; Foster, Dylan; Han, Yanjun; Qian, Jian; Rakhlin, Alexander; Xu, Yunbei (December 2024, Advances in Neural Information Processing Systems)

Full Text Available
Special Session: End-To-End Carbon Footprint Assessment and Modeling of Deep Learning

https://doi.org/10.1109/CODES-ISSS60120.2024.00011

Faiz, Ahmad; Jiang, Lei; Chen, Fan (September 2024, IEEE International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS))

Full Text Available
System Support for Environmentally Sustainable Computing in Data Centers

https://doi.org/10.1109/ISVLSI61997.2024.00094

Chen, Fan (July 2024, 2024 IEEE Computer Society Annual Symposium on VLSI (ISVLSI))

Full Text Available
TITAN: A Fast and Distributed Large-Scale Trapped-Ion NISQ Computer

https://doi.org/10.1145/3649329.3655908

Chu, Cheng; Fu, Zhenxiao; Xu, Yilun; Huang, Gang; Muller, Hausi; Chen, Fan; Jiang, Lei (November 2024, 61st ACM/IEEE Design Automation Conference (DAC))

Full Text Available
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts

Li, Jiachen; Wang, Xinyao; Zhu, Sijie; Kuo, Chia-Wen; Xu, Lu; Chen, Fan; Jain, Jitesh; Shi, Humphrey; Wen, Longyin (December 2024, NeurIPS 2024)

Recent advancements in Multimodal Large Language Models (LLMs) have focused primarily on scaling by increasing text-image pair data and enhancing LLMs to improve performance on multimodal tasks. However, these scaling approaches are computationally expensive and overlook the significance of efficiently improving model capabilities from the vision side. Inspired by the successful applications of Mixture-of-Experts (MoE) in LLMs, which improves model scalability during training while keeping inference costs similar to those of smaller models, we propose CuMo, which incorporates Co-upcycled Top-K sparsely-gated Mixture-of-Experts blocks into both the vision encoder and the MLP connector, thereby enhancing the multimodal LLMs with negligible additional activated parameters during inference. CuMo first pre-trains the MLP blocks and then initializes each expert in the MoE block from the pre-trained MLP block during the visual instruction tuning stage, with auxiliary losses to ensure a balanced loading of experts. CuMo outperforms state-of-the-art multimodal LLMs across various VQA and visual-instruction-following benchmarks within each model size group, all while training exclusively on open-sourced datasets.
Full Text Available
